Determining Native Language and Deception Using Phonetic Features and Classifier Combination

نویسندگان

Gábor Gosztolya

Tamás Grósz

Róbert Busa-Fekete

László Tóth

چکیده

For several years, the Interspeech ComParE Challenge has focused on paralinguistic tasks of various kinds. In this paper we focus on the Native Language and the Deception subchallenges of ComParE 2016, where the goal is to identify the native language of the speaker, and to recognize deceptive speech. As both tasks can be treated as classification ones, we experiment with several state-of-the-art machine learning methods (Support-Vector Machines, AdaBoost.MH and Deep Neural Networks), and also test a simple-yet-robust combination method. Furthermore, we will assume that the native language of the speaker affects the pronunciation of specific phonemes in the language he is currently using. To exploit this, we extract phonetic features for the Native Language task. Moreover, for the Deception Sub-Challenge we compensate for the highly unbalanced class distribution by instance re-sampling. With these techniques we are able to significantly outperform the baseline SVM on the unpublished test set.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

مقایسه روش های طیفی برای شناسایی زبان گفتاری

Identifying spoken language automatically is to identify a language from the speech signal. Language identification systems can be divided into two categories, spectral-based methods and phonetic-based methods. In the former, short-time characteristics of speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

متن کامل

Feature Space Selection and Combination for Native Language Identification

We decribe the submissions made by the National Research Council Canada to the Native Language Identification (NLI) shared task. Our submissions rely on a Support Vector Machine classifier, various feature spaces using a variety of lexical, spelling, and syntactic features, and on a simple model combination strategy relying on a majority vote between classifiers. Somewhat surprisingly, a classi...

متن کامل

Native Language Identification using Phonetic Algorithms

In this paper, we discuss the results of the IUCL system in the NLI Shared Task 2017. For our system, we explore a variety of phonetic algorithms to generate features for Native Language Identification. These features are contrasted with one of the most successful type of features in NLI, character n-grams. We find that although phonetic features do not perform as well as character n-grams alon...

متن کامل

Language- and Talker-dependent Variation in Global Features of Native and Non-native Speech

We motivate and present a corpus of scripted and spontaneous speech in both the native and the non-native language of talkers from various language backgrounds. Using corpus recordings from 11 native English and 11 late Mandarin-English bilinguals we compared speech timing across native English, native Mandarin, and Mandarin-accented English. Findings showed similarities across native Mandarin ...

متن کامل

Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection

When automatically detecting deception, it is important to model individual differences across speakers. We explore the automatic identification of individual traits such as gender, native language, and personality, using acoustic-prosodic and lexical features from an initial non-deceptive dialogue. We also explore predicting success at deception and at deception detection, using the same featu...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Determining Native Language and Deception Using Phonetic Features and Classifier Combination

نویسندگان

چکیده

منابع مشابه

مقایسه روش های طیفی برای شناسایی زبان گفتاری

Feature Space Selection and Combination for Native Language Identification

Native Language Identification using Phonetic Algorithms

Language- and Talker-dependent Variation in Global Features of Native and Non-native Speech

Identifying Individual Differences in Gender, Ethnicity, and Personality from Dialogue for Deception Detection

عنوان ژورنال:

اشتراک گذاری